Search for: All records

Creators/Authors contains: "Matuszek, Cynthia"

GPT-4 as a Moral Reasoner for Robot Command Rejection

Wen, Ruchen; Ferraro, Francis; Matuszek, Cynthia (November 2024, Proceedings of the 12th International Conference on Human-Agent Interaction)

To support positive, ethical human-robot interactions, robots need to be able to respond to unexpected situations in which societal norms are violated, including rejecting unethical commands. Implementing robust communication for robots is inherently difficult due to the variability of context in real-world settings and the risks of unintended influence during robots’ communication. HRI researchers have begun exploring the potential use of LLMs as a solution for language-based communication, which will require an in-depth understanding and evaluation of LLM applications in different contexts. In this work, we explore how an existing LLM responds to and reasons about a set of norm-violating requests in HRI contexts. We ask human participants to assess the performance of a hypothetical GPT-4-based robot on moral reasoning and explanatory language selection as it compares to human intuitions. Our findings suggest that while GPT-4 performs well at identifying norm-violating requests and suggesting non-compliant responses, its failure to match the linguistic preferences and context sensitivity of humans prevents it from being a comprehensive solution for moral communication between humans and robots. Based on our results, we provide a four-point recommendation for the community on incorporating LLMs into HRI systems.
Full Text Available
Domain-Specific Retrieval-Augmented Generation Using Vector Stores, Knowledge Graphs, and Tensor Factorization

Barron, Ryan C; Grantcharov, Vesselin; Wanna, Selma; Eren, Maksim E; Bhattarai, Manish; Solovyev, Nicholas; Tompkins, George; Nicholas, Charles; Rasmussen, Kim O; Matuszek, Cynthia; et al (December 2024, Proceedings of the 2024 International Conference on Machine Learning and Applications (ICMLA))

Large Language Models (LLMs) are pre-trained on large-scale corpora and excel in numerous general natural language processing (NLP) tasks, such as question answering (QA). Despite their advanced language capabilities, when it comes to domain-specific and knowledge-intensive tasks, LLMs suffer from hallucinations, knowledge cut-offs, and lack of knowledge attributions. Additionally, fine-tuning LLMs' intrinsic knowledge to highly specific domains is an expensive and time-consuming process. The retrieval-augmented generation (RAG) process has recently emerged as a method capable of optimizing LLM responses by referencing them to a predetermined ontology. It was shown that using a Knowledge Graph (KG) ontology for RAG improves QA accuracy by taking into account relevant sub-graphs that preserve the information in a structured manner. In this paper, we introduce SMART-SLIC, a highly domain-specific LLM framework that integrates RAG with a KG and a vector store (VS) that store factual domain-specific information. Importantly, to avoid hallucinations in the KG, we build these highly domain-specific KGs and VSs without the use of LLMs, but via NLP, data mining, and nonnegative tensor factorization with automatic model selection. Pairing our RAG with a domain-specific (i) KG (containing structured information) and (ii) VS (containing unstructured information) enables the development of domain-specific chatbots that attribute the source of information, mitigate hallucinations, lessen the need for fine-tuning, and excel in highly domain-specific question-answering tasks. We pair SMART-SLIC with chain-of-thought prompting agents. The framework is designed to generalize to any specific or specialized domain. In this paper, we demonstrate the question-answering capabilities of our framework on a corpus of scientific publications on malware analysis and anomaly detection.
Full Text Available
A Large Model’s Ability to Identify 3D Objects as a Function of Viewing Angle

https://doi.org/10.1109/AIxVR59861.2024.00006

Rubinstein, Jacob; Ferraro, Francis; Matuszek, Cynthia; Engel, Don (January 2024, Proceedings of the IEEE Artificial Intelligence x Virtual Reality (AIxVR) Conference)

Virtual reality is progressively more widely used to support embodied AI agents, such as robots, which frequently engage in ‘sim-to-real’ based learning approaches. At the same time, tools such as large vision-and-language models offer new capabilities that tie into a wide variety of tasks and capabilities. In order to understand how such agents can learn from simulated environments, we explore a language model’s ability to recover the type of object represented by a photorealistic 3D model as a function of the 3D perspective from which the model is viewed. We used photogrammetry to create 3D models of commonplace objects and rendered 2D images of these models from a fixed set of 420 virtual camera perspectives. A well-studied image and language model (CLIP) was used to generate text (i.e., prompts) corresponding to these images. Using multiple instances of various object classes, we studied which camera perspectives were most likely to return accurate text categorizations for each class of object.
Full Text Available
Scarecrows in Oz: The Use of Large Language Models in HRI

Williams, Tom; Matuszek, Cynthia; Mead, Ross; DePalma, Nick (January 2024, ACM Transactions on Human-Robot Interaction)

The proliferation of Large Language Models (LLMs) presents both a critical design challenge and a remarkable opportunity for the field of Human-Robot Interaction (HRI). While the direct deployment of LLMs on interactive robots may be unsuitable for reasons of ethics, safety, and control, LLMs might nevertheless provide a promising baseline technique for many elements of HRI. Specifically, in this position paper, we argue for the use of LLMs as Scarecrows: ‘brainless,’ straw-man black-box modules integrated into robot architectures for the purpose of quickly enabling full-pipeline solutions, much like the use of “Wizard of Oz” (WoZ) and other human-in-the-loop approaches. We explicitly acknowledge that these Scarecrows, rather than providing a satisfying or scientifically complete solution, incorporate a form of the wisdom of the crowd, and, in at least some cases, will ultimately need to be replaced or supplemented by a robust and theoretically motivated solution. We provide examples of how Scarecrows could be used in language-capable robot architectures as useful placeholders, and suggest initial reporting guidelines for authors, mirroring existing guidelines for the use and reporting of WoZ techniques.
Full Text Available
Does Starting Deep Learning Homework Earlier Improve Grades?

Raff, Edward; Matuszek, Cynthia (October 2023, Proceedings of the European Conference on Artificial Intelligence (ECAI))

Intuitively, students who start a homework assignment earlier and spend more time on it should receive better grades on the assignment. However, existing literature on the impact of time spent on homework is not clear-cut and comes mostly from K-12 education. It is not clear that these prior studies can inform coursework in deep learning due to differences in demographics, as well as the computational time needed for assignments to be completed. We study this problem in a post-hoc study of three semesters of a deep learning course at the University of Maryland, Baltimore County (UMBC), and develop a hierarchical Bayesian model to help make principled conclusions about the impact on student success given an approximate measure of the total time spent on the homework, and how early students submitted the assignment. Our results show that both submitting early and spending more time relate positively to the final grade. Surprisingly, the value of an additional day of work is apparently equal across students, even when some require less total time to complete an assignment.
Full Text Available
Measuring Equality in Machine Learning Security Defenses: A Case Study in Speech Recognition

Richards, Luke E.; Raff, Edward; Matuszek, Cynthia (November 2023, Proceedings of the ACM Workshop on Artificial Intelligence and Security (AISec))

Over the past decade, the machine learning security community has developed a myriad of defenses for evasion attacks. An understudied question in that community is: for whom do these defenses defend? This work considers common approaches to defending learned systems and how security defenses result in performance inequities across different sub-populations. We outline appropriate parity metrics for analysis and begin to answer this question through empirical results of the fairness implications of machine learning security methods. We find that many methods that have been proposed can cause direct harm, like false rejection and unequal benefits from robustness training. The framework we propose for measuring defense equality can be applied to robustly trained models, preprocessing-based defenses, and rejection methods. We identify a set of datasets with a user-centered application and a reasonable computational cost suitable for case studies in measuring the equality of defenses. In our case study of speech command recognition, we show how such adversarial training and augmentation have non-equal but complex protections for social subgroups across gender, accent, and age in relation to user coverage. We present a comparison of equality between two rejection-based defenses: randomized smoothing and neural rejection, finding randomized smoothing more equitable due to the sampling mechanism for minority groups. This represents the first work examining the disparity in the adversarial robustness in the speech domain and the fairness evaluation of rejection-based defenses.
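To make the idea of a parity metric concrete, the following is a minimal sketch of one simple measure: per-subgroup accuracy and the largest gap between any two subgroups. The abstract does not specify which metrics the paper uses, so the function names (`accuracy_by_group`, `parity_gap`) and the toy data are assumptions for illustration only, not the authors' framework.

```python
from collections import defaultdict


def accuracy_by_group(labels, predictions, groups):
    """Return {group: accuracy} for three parallel sequences."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p, g in zip(labels, predictions, groups):
        total[g] += 1
        correct[g] += int(y == p)
    return {g: correct[g] / total[g] for g in total}


def parity_gap(per_group):
    """Largest difference in the metric between any two groups (0 means parity)."""
    values = list(per_group.values())
    return max(values) - min(values)


# Hypothetical toy data: spoken commands from two illustrative subgroups.
labels      = ["yes", "no", "yes", "no",  "yes", "no"]
predictions = ["yes", "no", "no",  "yes", "yes", "yes"]
groups      = ["a",   "a",  "a",   "b",   "b",   "b"]

per_group = accuracy_by_group(labels, predictions, groups)
gap = parity_gap(per_group)
print(per_group, gap)
```

The same gap could be computed before and after a defense is applied; a gap that grows would suggest the defense benefits some subgroups more than others.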
Full Text Available
Multimodal Language Learning for Object Retrieval in Low Data Regimes in the Face of Missing Modalities

Darvish, Kasra; Raff, Edward; Ferraro, Francis; Matuszek, Cynthia (October 2023, Transactions on Machine Learning Research)

Our study is motivated by robotics: when dealing with robots or other physical systems, we often need to balance the competing concerns of relying on complex, multimodal data coming from a variety of sensors and a general lack of large representative datasets. Despite the complexity of modern robotic platforms and the need for multimodal interaction, there has been little research on integrating more than two modalities in a low data regime with the real-world constraint that sensors fail due to obstructions or adverse conditions. In this work, we consider a case in which natural language is used as a retrieval query against objects, represented across multiple modalities, in a physical environment. We introduce extended multimodal alignment (EMMA), a method that learns to select the appropriate object while jointly refining modality-specific embeddings through a geometric (distance-based) loss. In contrast to prior work, our approach is able to incorporate an arbitrary number of views (modalities) of a particular piece of data. We demonstrate the efficacy of our model on a grounded language object retrieval scenario. We show that our model outperforms state-of-the-art baselines when little training data is available. Our code is available at https://github.com/kasraprime/EMMA
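As a rough illustration of what a distance-based loss over an arbitrary number of modalities can look like, here is a minimal sketch using a generic triplet margin loss summed over whichever modalities are present. This is a swapped-in textbook technique, not EMMA's actual loss (see the authors' repository for that); the function name `multi_view_margin_loss`, the modality names, and the toy embeddings are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def multi_view_margin_loss(query, positive_views, negative_views, margin=1.0):
    """Sum hinge terms max(0, d(query, pos) - d(query, neg) + margin)
    over every modality present in both dictionaries.

    Modalities missing from either side (e.g. a failed sensor) are skipped.
    """
    loss = 0.0
    for name, pos in positive_views.items():
        neg = negative_views.get(name)
        if neg is None:
            continue  # modality unavailable for the negative object
        loss += max(0.0, euclidean(query, pos) - euclidean(query, neg) + margin)
    return loss


# Hypothetical 2-D embeddings: a language query and two candidate objects.
query = [0.0, 0.0]
positive = {"rgb": [0.0, 1.0], "depth": [3.0, 4.0], "audio": [1.0, 0.0]}
negative = {"rgb": [3.0, 4.0], "depth": [0.0, 1.0]}  # audio sensor missing

loss = multi_view_margin_loss(query, positive, negative)
print(loss)
```

Because the loss is a sum over a dictionary of views, adding a modality only means adding a key, and a missing modality simply contributes no term.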
Full Text Available
Photogrammetry and VR for Comparing 2D and Immersive Linguistic Data Collection (Student Abstract)

https://doi.org/10.1609/aaai.v37i13.27016

Rubinstein, Jacob; Matuszek, Cynthia; Engel, Don (September 2023, Proceedings of the AAAI Conference on Artificial Intelligence)

The overarching goal of this work is to enable the collection of language describing a wide variety of objects viewed in virtual reality. We aim to create full 3D models from a small number of ‘keyframe’ images of objects found in the publicly available Grounded Language Dataset (GoLD) using photogrammetry. We will then collect linguistic descriptions by placing our models in virtual reality and having volunteers describe them. To evaluate the impact of virtual reality immersion on linguistic descriptions of the objects, we intend to apply contrastive learning to perform grounded language learning, then compare the descriptions collected from images (in GoLD) versus our models.
Full Text Available
A Collaborative Building Task in VR vs. Reality

Higgins, Padraig; Barron, Ryan; Lukin, Stephanie; Engel, Don; Matuszek, Cynthia (November 2023, Proceedings of the International Symposium on Experimental Robotics)

Human-robot interaction is a critical area of research, providing support for collaborative tasks where a human instructs a robot to interact with and manipulate objects in an environment. However, an under-explored element of these collaborative manipulation tasks is small-scale building exercises, in which the human and robot are working together in close proximity with the same set of objects. Under these conditions, it is essential to ensure the human’s safety and mitigate comfort risks during the interaction. As there is danger in exposing humans to untested robots, a safe and controlled environment is required. Simulation and virtual reality (VR) for HRI have shown themselves to be suitable tools for creating space for human-robot experimentation that can be beneficial in these scenarios. However, the use of simulation and VR comes with the possibility of failures resulting from the sim-to-real gap, where the behavior of the simulated robot may not accurately reflect the experience of a human collaborator in a real-world setting. This gap can limit the generalizability of research findings and raise questions about the validity of using simulation and VR for HRI research. Our goal in this work is to demonstrate the effectiveness of sim-to-real approaches for contact-based human-robot interaction.
Full Text Available
Machine Learning Security as a Source of Unfairness in Human-Robot Interaction

Richards, Luke E.; Matuszek, Cynthia (March 2023, Human-Robot Interaction (HRI) Workshop on Inclusive HRI II: Equity and Diversity in Design, Application, Methods, and Community (DEI HRI))

Machine learning models that sense human speech, body placement, and other key features are commonplace in human-robot interaction. However, the deployment of such models is itself not without risk. Research in the security of machine learning examines how such models can be exploited and the risks associated with these exploits. Unfortunately, the threat models of risks produced by machine learning security do not incorporate the rich sociotechnical underpinnings of the defenses they propose; as a result, efforts to improve the security of machine learning models may actually increase the difference in performance across different demographic groups, yielding systems whose risk mitigation works better for one group than another. In this work, we outline why current approaches to machine learning security present DEI concerns for the human-robot interaction community and where there are open areas for collaboration.
Full Text Available